Abstract

We establish a formal connection between the problem of characterizing degrees of freedom 
(DoF) in constant single-antenna interference channels (ICs), with general channel matrix, and the 
field of additive combinatorics. The theory we develop is based on a recent breakthrough result by
Hochman in fractal geometry [2]. Our first main contribution is an explicit condition on the channel
matrix to admit full, i.e., K/2 DoF; this condition is satisfied for almost all channel matrices. We 
also provide a construction of corresponding DoF-optimal input distributions. The second main result 
is a new DoF-formula exclusively in terms of Shannon entropies. This formula is more amenable to 
both analytical statements and numerical evaluations than the DoF-formula by Wu et al. [3], which 
is in terms of Rényi information dimension. We then use the new DoF-formula to shed light on
the hardness of finding the exact number of DoF in ICs with rational channel coefficients, and to 
improve the best known bounds on the DoF of a well-studied channel matrix. 

I. Introduction 

A breakthrough finding in network information theory was the result that K/2 degrees of 
freedom (DoF) can be achieved in K-user single-antenna interference channels (ICs) [4], [5]. The
corresponding transmit/receive scheme, known as interference alignment, exploits time-frequency 
selectivity of the channel to align interference at the receivers into low-dimensional subspaces. 

Characterizing the DoF in ICs under various assumptions on the channel matrix has since become
a heavily researched topic. A particularly surprising result states that K/2 DoF can be achieved in
single-antenna K-user ICs with constant channel matrix [6], [7], i.e., in channels that do not exhibit
any selectivity. This result was shown to hold for (Lebesgue) almost all¹ channel matrices [6, Thm. 1].
Instead of exploiting channel selectivity, here interference alignment happens on a number-theoretic
level. The technical arguments from Diophantine approximation theory used in the proof of [6,
Thm. 1] do not seem to allow an explicit characterization of the “almost-all set” of full-DoF admitting
channel matrices. What is known, though, is that channel matrices with all entries rational admit
strictly less than K/2 DoF [7] and hence belong to the set of exceptions relative to the “almost-all
result” in [6].

The material in this paper was presented in part at the IEEE International Symposium on Information Theory, Honolulu,
HI, June 2014 [1].

Recently, Wu et al. [3] developed a general framework, based on (Rényi) information dimension,
for characterizing the DoF in constant single-antenna ICs. While this general and elegant theory
allows one to recover, inter alia, the “almost-all result” from [6], it does not provide insights into the
structure of the set of channel matrices admitting K/2 DoF. In addition, the DoF-formula in [3] is 
in terms of information dimension, which can be difficult to evaluate. 

Contributions: Our first main contribution is to complement the results in [3], [6], [7] by 
providing explicit and almost surely satisfied conditions on the IC matrix to admit full, i.e., K/2 
DoF. The conditions we find essentially require that the set of all monomial² expressions in the
channel coefficients be linearly independent over the rational numbers. The proof of this result is 
based on a recent breakthrough in fractal geometry [2], which allows us to compute the information 
dimension of self-similar distributions under conditions much milder than the open set condition [8] 
required in [3]. For channel matrices satisfying our explicit and almost sure conditions, we furthermore 
present an explicit construction of DoF-optimal input distributions. The basic idea underlying this 
construction has roots in the field of additive combinatorics [9] and essentially ensures that the set-sum 
of signal and interference exhibits extremal cardinality properties. We also show that our sufficient 
conditions for K/2 DoF are not necessary. This is accomplished by constructing examples of channel 
matrices that admit K/2 DoF but do not satisfy the sufficient conditions we identify. The set of all 
such channel matrices, however, necessarily has Lebesgue measure zero. 

Etkin and Ordentlich [7] discovered that tools from additive combinatorics can be applied to
characterize DoF in ICs where the off-diagonal entries in the channel matrix are rational numbers and
the diagonal entries are either irrational algebraic³ or rational numbers. Our second main contribution
is to establish a formal connection between additive combinatorics and the characterization of DoF
in ICs with arbitrary channel matrices. Specifically, we show how the DoF-characterization in terms
of information dimension, discovered in [3], can be translated, again based on [2], into an alternative
characterization exclusively involving Shannon entropies. The resulting new DoF-formula is more
amenable to both analytical statements and numerical evaluation than the one in [3]. To support this
statement, we show how the alternative DoF-formula can be used to explain why determining the exact
number of DoF for channel matrices with rational entries, even for simple examples, has remained
elusive so far. Specifically, we establish that DoF-characterization for rational channel matrices is
equivalent to very hard open problems in additive combinatorics. Finally, we exemplify the quantitative
applicability of the new DoF-formula by improving the best-known bounds on the DoF of a particular
channel matrix studied in [3].

¹ Throughout the paper “almost all” is to be understood with respect to Lebesgue measure and “almost sure” is with
respect to a probability distribution that is absolutely continuous with respect to Lebesgue measure.

² A monomial in the variables $x_1, \ldots, x_n$ is an expression of the form $x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}$, with $k_i \in \mathbb{N}$.

³ A real number is called algebraic if it is the zero of a polynomial with integer coefficients. In particular, all rational
numbers are algebraic.

Notation: Random variables are represented by uppercase letters from the end of the alphabet. 
Lowercase letters are used exclusively for deterministic quantities. Boldface uppercase letters indicate 
matrices. Sets are denoted by uppercase calligraphic letters. For $x \in \mathbb{R}$, we write $\lfloor x \rfloor$ for the largest
integer not exceeding $x$. All logarithms are taken to the base 2. $\mathbb{E}[\cdot]$ denotes the expectation operator.
$H(\cdot)$ stands for entropy and $h(\cdot)$ for differential entropy. For a measurable real-valued function $f$ and
a measure⁴ $\mu$ on its domain, the push-forward of $\mu$ by $f$ is $(f_*\mu)(\mathcal{A}) = \mu(f^{-1}(\mathcal{A}))$ for Borel sets $\mathcal{A}$.

Outline of the paper: In Section II, we introduce the system model for constant single-antenna 
ICs. Section III contains our first main result, Theorem 1, providing explicit and almost surely satisfied
conditions on channel matrices to admit full, i.e., K/2 DoF. In Section IV, we review the basic
material on information dimension, self-similar distributions, and additive combinatorics needed in 
the paper. Section V is devoted to sketching the ideas underlying the proof of Theorem 1 in an 
informal fashion and to introducing the recent result by Hochman [2] that both our main results rely 
on. In Section VI, we formally prove Theorem 1. Section VII presents a non-asymptotic version of 
Theorem 1. In Section VIII, we establish that our sufficient conditions for K/2 DoF are not necessary. 
Our second main result, Theorem 3, which provides a DoF-characterization exclusively in terms of 
Shannon entropies, is presented, along with its proof, in Section IX. Finally, in Section X we discuss 
the formal connection between DoF and sumset theory, a branch of additive combinatorics, and we 
apply the new DoF-formula to channel matrices with rational entries. 


II. System model 


We consider a single-antenna K-user IC with constant channel matrix $\mathbf{H} = (h_{ij})_{1 \le i,j \le K} \in \mathbb{R}^{K \times K}$
and input-output relation

$$Y_i = \sqrt{\mathsf{snr}} \sum_{j=1}^{K} h_{ij} X_j + Z_i, \qquad i = 1, \ldots, K, \tag{1}$$

where $X_i \in \mathbb{R}$ is the input at the i-th transmitter, $Y_i \in \mathbb{R}$ is the output at the i-th receiver, and $Z_i \in \mathbb{R}$
is noise of absolutely continuous distribution such that $h(Z_i) > -\infty$ and $H(\lfloor Z_i \rfloor) < \infty$. The input
signals are independent across transmitters and noise is i.i.d. across users and channel uses.

The channel matrix $\mathbf{H}$ is assumed to be known perfectly at all transmitters and receivers. We
impose the average power constraint

$$\frac{1}{n} \sum_{k=1}^{n} \big(x_i^{(k)}\big)^2 \le 1$$

on codewords $\big(x_i^{(1)} \ldots x_i^{(n)}\big)$ of block-length n transmitted by user $i = 1, \ldots, K$. The DoF of this
channel are defined as

$$\mathsf{DoF}(\mathbf{H}) := \limsup_{\mathsf{snr} \to \infty} \frac{C(\mathbf{H}; \mathsf{snr})}{\frac{1}{2} \log \mathsf{snr}}, \tag{2}$$

where $C(\mathbf{H}; \mathsf{snr})$ is the sum-capacity of the IC.

⁴ Throughout the paper, the terms “measurable” and “measure” are to be understood with respect to the Borel σ-algebra.

III. Explicit and almost sure conditions for K/2 DoF 

We denote the vector consisting of the off-diagonal entries of $\mathbf{H}$ by $\mathbf{h} \in \mathbb{R}^{K(K-1)}$ and let $f_1, f_2, \ldots$
be the monomials in $K(K-1)$ variables, i.e., $f_i(x_1, \ldots, x_{K(K-1)}) = x_1^{d_1} \cdots x_{K(K-1)}^{d_{K(K-1)}}$, enumerated as
follows: $f_1, \ldots, f_{\varphi(d)}$ are the monomials of degree⁵ not larger than d, where

$$\varphi(d) := \binom{K(K-1) + d}{d}.$$
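
As a quick sanity check on the counting formula for $\varphi(d)$, the following minimal sketch enumerates exponent vectors directly. The function names `monomial_exponents` and `phi` are illustrative choices, not taken from the paper, and the brute-force enumeration is only practical for small K and d.

```python
from itertools import product
from math import comb

def monomial_exponents(num_vars, max_degree):
    """All exponent tuples (d_1, ..., d_m) with d_1 + ... + d_m <= max_degree."""
    return [e for e in product(range(max_degree + 1), repeat=num_vars)
            if sum(e) <= max_degree]

def phi(K, d):
    """Number of monomials of degree <= d in K(K-1) variables."""
    return comb(K * (K - 1) + d, d)

K = 2  # two users, hence K(K-1) = 2 off-diagonal channel coefficients
counts = [len(monomial_exponents(K * (K - 1), d)) for d in range(5)]
print(counts)
print([phi(K, d) for d in range(5)])
```

Both printed lists agree, which is the binomial identity behind the definition of $\varphi(d)$.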

The following theorem contains the first main result of the paper, namely conditions on H to admit 
K/2 DoF that are explicit and satisfied for almost all H. 

Theorem 1: Suppose that the channel matrix H satisfies the following condition: 

For each $i = 1, \ldots, K$, the set

$$\{f_j(\mathbf{h}) : j \ge 1\} \cup \{h_{ii} f_j(\mathbf{h}) : j \ge 1\} \tag{$*$}$$

is linearly independent over $\mathbb{Q}$.

Then, we have 

DoF(H) = K/2. 

Proof: See Section VI. ■ 

We first note that, as detailed in the proof of Theorem 1, Condition (*) implies that all entries of 
H must be nonzero, i.e., H must be fully connected in the terminology of [7]. By [10, Prop. 1] we 
have DoF(H) ≤ K/2 for fully connected channel matrices. The proof of Theorem 1 is constructive
in the sense of providing input distributions that achieve this upper bound. 

⁵ The “degree” of a monomial is defined as the sum of all exponents of the variables involved (sometimes called the total
degree).
Let us next dissect Condition (*). A set $\mathcal{S} \subseteq \mathbb{R}$ is linearly independent over $\mathbb{Q}$ if, for all $n \in \mathbb{N}$
and all pairwise distinct $v_1, \ldots, v_n \in \mathcal{S}$, the only solution $q_1, \ldots, q_n \in \mathbb{Q}$ of the equation

$$q_1 v_1 + \ldots + q_n v_n = 0 \tag{3}$$

is $q_1 = \ldots = q_n = 0$. Thus, if Condition (*) is not satisfied, there exists, for at least one $i \in \{1, \ldots, K\}$,
a non-trivial linear combination of a finite number of elements of the set

$$\{f_j(\mathbf{h}) : j \ge 1\} \cup \{h_{ii} f_j(\mathbf{h}) : j \ge 1\}$$

with rational coefficients which equals zero. In fact, this is equivalent to the existence of a non-trivial
linear combination that equals zero and has all coefficients in $\mathbb{Z}$. This can be seen by simply
multiplying (3) by a common denominator of $q_1, \ldots, q_n$.
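
The step from rational to integer coefficients can be made concrete with exact rational arithmetic. The sketch below is only an illustration: the helper name `clear_denominators` and the two sample values are assumptions made for this example, and `math.lcm` with several arguments requires Python 3.9 or later.

```python
from fractions import Fraction
from math import lcm

def clear_denominators(coeffs):
    """Scale rational coefficients by a common denominator to obtain integers."""
    common = lcm(*(q.denominator for q in coeffs))
    return [int(q * common) for q in coeffs]

# Two nonzero rationals v_1, v_2 always satisfy v_2 * v_1 - v_1 * v_2 = 0,
# which is a non-trivial relation with rational coefficients (v_2, -v_1).
v = [Fraction(1, 3), Fraction(2, 5)]
q = [v[1], -v[0]]

z = clear_denominators(q)                        # integer coefficients
relation = sum(zi * vi for zi, vi in zip(z, v))  # evaluates the combination exactly
print(z, relation)
```

Scaling by the common denominator changes neither the zero value of the combination nor its non-triviality, which is all the equivalence in the text needs.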

To show that Condition (*) is satisfied for almost all channel matrices, we will argue that the
condition is violated on a set of Lebesgue measure zero with respect to $\mathbf{H}$. To this end, we first note
that for fixed $d \in \mathbb{N}$, fixed $a_1, \ldots, a_{\varphi(d)}, b_1, \ldots, b_{\varphi(d)} \in \mathbb{Z}$ not all equal to zero, and fixed $i \in \{1, \ldots, K\}$,

$$\sum_{j=1}^{\varphi(d)} a_j f_j(\mathbf{h}) + \sum_{j=1}^{\varphi(d)} b_j h_{ii} f_j(\mathbf{h}) = 0 \tag{4}$$

is satisfied only on a set of measure zero with respect to $\mathbf{H}$, as the solutions of (4) are given by the
set of zeros of a polynomial in the channel coefficients. This polynomial is not identically zero, since the
$f_j(\mathbf{h})$ and $h_{ii} f_j(\mathbf{h})$ are pairwise distinct monomials in the entries of $\mathbf{H}$. Since the set of equations (4) is countable
with respect to $d \in \mathbb{N}$, $a_1, \ldots, a_{\varphi(d)}, b_1, \ldots, b_{\varphi(d)} \in \mathbb{Z}$, and $i \in \{1, \ldots, K\}$, the set of channel matrices
violating Condition (*) is given by a countable union of sets of measure zero, which again has measure
zero. It therefore follows that Condition (*) is satisfied for almost all channel matrices $\mathbf{H}$ and hence
Theorem 1 provides conditions on $\mathbf{H}$ that not only guarantee that K/2 DoF can be achieved but are
also explicit and almost surely satisfied.

We finally note that the prominent example from [7] with all entries of H rational, shown in [7] to 
admit strictly less than K/2 DoF, does not satisfy Condition (*), as two rational numbers are always 
linearly dependent over Q. 


IV. Preparatory Material 

This section briefly reviews basic material on information dimension, self-similar distributions, and 
additive combinatorics needed in the rest of the paper. 
A. Information dimension and DoF

Definition 1: Let X be a random variable with arbitrary distribution⁶ $\mu$. We define the lower and
upper information dimension of X as

$$\underline{d}(X) := \liminf_{k \to \infty} \frac{H(\langle X \rangle_k)}{\log k} \quad \text{and} \quad \overline{d}(X) := \limsup_{k \to \infty} \frac{H(\langle X \rangle_k)}{\log k},$$

where $\langle X \rangle_k := \lfloor kX \rfloor / k$. If $\underline{d}(X) = \overline{d}(X)$, we set $d(X) := \underline{d}(X) = \overline{d}(X)$ and call $d(X)$ the
information dimension of X. Since $\underline{d}(X)$, $\overline{d}(X)$, and $d(X)$ depend on $\mu$ only, we sometimes also
write $\underline{d}(\mu)$, $\overline{d}(\mu)$, and $d(\mu)$, respectively.

The relevance of information dimension in characterizing DoF stems from the following relation
[11], [3], [12]

$$\limsup_{\mathsf{snr} \to \infty} \frac{h(\sqrt{\mathsf{snr}}\, X + Z)}{\frac{1}{2} \log \mathsf{snr}} = \overline{d}(X), \tag{5}$$

which holds for arbitrary independent random variables X and Z, with the distribution of Z absolutely
continuous and such that $h(Z) > -\infty$ and $H(\lfloor Z \rfloor) < \infty$.

We can apply (5) to ICs as follows. By standard random coding arguments we get that the sum-rate

$$I(X_1; Y_1) + \cdots + I(X_K; Y_K) \tag{6}$$

is achievable, where $X_1, \ldots, X_K$ are independent input distributions with $\mathbb{E}[X_i^2] \le 1$, $i = 1, \ldots, K$.
Using the chain rule, we obtain

$$\begin{aligned}
I(X_i; Y_i) &= h(Y_i) - h(Y_i \mid X_i) && (7) \\
&= h\Big(\sqrt{\mathsf{snr}} \sum_{j=1}^{K} h_{ij} X_j + Z_i\Big) - h\Big(\sqrt{\mathsf{snr}} \sum_{j \ne i} h_{ij} X_j + Z_i\Big) && (8)
\end{aligned}$$

for $i = 1, \ldots, K$. Combining (5)-(8), it now follows that [3]

$$\begin{aligned}
\mathsf{dof}(X_1, \ldots, X_K; \mathbf{H}) &:= \sum_{i=1}^{K} \Big[ d\Big(\sum_{j=1}^{K} h_{ij} X_j\Big) - d\Big(\sum_{j \ne i} h_{ij} X_j\Big) \Big] && (9) \\
&\le \mathsf{DoF}(\mathbf{H}), && (10)
\end{aligned}$$

for all independent $X_1, \ldots, X_K$ with⁷ $\mathbb{E}[X_i^2] < \infty$, $i = 1, \ldots, K$, and such that all information
dimension terms appearing in (9) exist. A striking result in [3] shows that inputs of discrete,
continuous, or mixed discrete-continuous distribution can achieve no more than 1 DoF irrespective of
K. For K > 2, input distributions achieving K/2 (i.e., full) DoF therefore necessarily have a singular
component.

⁶ We consider general distributions which may be discrete, continuous, singular, or mixtures thereof.

⁷ We only need the conditions $\mathbb{E}[X_i^2] < \infty$ as scaling of the inputs does not affect $\mathsf{dof}(X_1, \ldots, X_K; \mathbf{H})$.

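Taking the supremum in (10) over all admissible $X_1, \ldots, X_K$ yields

$$\mathsf{DoF}(\mathbf{H}) \ge \sup_{X_1, \ldots, X_K} \sum_{i=1}^{K} \Big[ d\Big(\sum_{j=1}^{K} h_{ij} X_j\Big) - d\Big(\sum_{j \ne i} h_{ij} X_j\Big) \Big]. \tag{11}$$
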


It was furthermore discovered in [3] that equality in (11) holds for almost all channel matrices H; an 
explicit characterization of this “almost-all set”, however, does not seem to be available. The right- 
hand side (RHS) of (11) can be difficult to evaluate as explicit expressions for information dimension 
are available only for a few classes of distributions such as mixed discrete-continuous distributions
or (singular) self-similar distributions reviewed in the next section. 


B. Self-similar distributions and iterated function systems 

A class of singular distributions with explicit expressions for their information dimension is given 
by self-similar distributions [13]. What is more, self-similar input distributions can be constructed to 
retain self-similarity under linear combinations, thereby allowing us to get explicit expressions for
the information dimension of the output distributions in (9). For an excellent in-depth treatment of
the material reviewed in this section, the interested reader is referred to [14].

We proceed to the definition of self-similar distributions. Consider a finite set $\Phi_r := \{\varphi_{i,r} : i = 1, \ldots, n\}$
of affine contractions $\varphi_{i,r} : \mathbb{R} \to \mathbb{R}$, i.e.,

$$\varphi_{i,r}(x) = r x + w_i, \tag{12}$$

where $r \in I \subseteq (0,1)$ and the $w_i$ are pairwise distinct real numbers. We furthermore set $\mathcal{W} :=
\{w_1, \ldots, w_n\}$. $\Phi_r$ is called an iterated function system (IFS) parametrized by the contraction parameter
$r \in I$. By classical fractal geometry [14, Ch. 9] every IFS has an associated unique attractor, i.e., a
non-empty compact set $\mathcal{A} \subseteq \mathbb{R}$ such that

$$\mathcal{A} = \bigcup_{i=1}^{n} \varphi_{i,r}(\mathcal{A}). \tag{13}$$

Moreover, for each probability vector $(p_1, \ldots, p_n)$, there is a unique (Borel) probability distribution
$\mu_r$ on $\mathbb{R}$ such that

$$\mu_r = \sum_{i=1}^{n} p_i \, (\varphi_{i,r})_* \mu_r, \tag{14}$$

where $(\varphi_{i,r})_* \mu_r$ is the push-forward of $\mu_r$ by $\varphi_{i,r}$. The distribution $\mu_r$ is supported on the attractor
set $\mathcal{A}$ in (13) and is referred to as the self-similar distribution corresponding to the IFS $\Phi_r$ with
underlying probability vector $(p_1, \ldots, p_n)$. We can give the following explicit expression for a random
variable X with distribution $\mu_r$ as in (14)

$$X = \sum_{k=0}^{\infty} r^k W_k, \tag{15}$$

where $\{W_k\}_{k \ge 0}$ is a set of i.i.d. copies of a random variable W drawn from the set $\mathcal{W}$ according to
$(p_1, \ldots, p_n)$.
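
The series representation (15) suggests a simple way to draw approximate samples from a self-similar distribution: truncate the sum after finitely many terms. The sketch below does this for a hypothetical example with $r = 1/3$, value set $\{0, 2\}$, and equal probabilities, which gives a scaled Cantor distribution. The function name, the truncation depth, and the seed are choices made for this illustration only.

```python
import random

def sample_self_similar(r, values, probs, depth=60, rng=random):
    """Approximate sample of X = sum_{k>=0} r^k W_k, truncated after `depth` terms."""
    digits = rng.choices(values, weights=probs, k=depth)  # i.i.d. copies of W
    return sum(w * r**k for k, w in enumerate(digits))

rng = random.Random(0)  # fixed seed for reproducibility
samples = [sample_self_similar(1 / 3, [0.0, 2.0], [0.5, 0.5], rng=rng)
           for _ in range(1000)]

# The attractor is contained in [0, 1] U [2, 3]: the first digit W_0 decides
# the branch, and the remaining terms contribute at most 1 in total.
print(min(samples), max(samples))
```

No sample falls into the gap (1, 2), reflecting that the distribution is supported on the attractor of the IFS rather than on an interval.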


C. A glimpse of additive combinatorics 

The common theme of our two main results is a formal relationship between the study of DoF 
in constant single-antenna ICs and the field of additive combinatorics. This connection is enabled 
by the recent breakthrough result in fractal geometry reported in [2] and summarized in Section V. 
We next briefly discuss material from additive combinatorics that is relevant for our discussion. For 
a detailed treatment of additive combinatorics we refer the reader to [9]. Specifically, we will be 
concerned with sumset theory, which studies, for discrete sets $\mathcal{U}$, $\mathcal{V}$, the cardinality of the sumset
$\mathcal{U} + \mathcal{V} = \{u + v : u \in \mathcal{U}, v \in \mathcal{V}\}$ relative to $|\mathcal{U}|$ and $|\mathcal{V}|$. We begin by noting the trivial bounds

$$\max\{|\mathcal{U}|, |\mathcal{V}|\} \le |\mathcal{U} + \mathcal{V}| \le |\mathcal{U}| \cdot |\mathcal{V}|, \tag{16}$$

for $\mathcal{U}$ and $\mathcal{V}$ finite and non-empty. One of the central ideas in sumset theory says that the left-hand
inequality in (16) can be close to equality only if $\mathcal{U}$ and $\mathcal{V}$ have a common algebraic structure
(e.g., lattice structures), whereas the right-hand inequality in (16) will be close to equality only if
the pairs $\mathcal{U}$ and $\mathcal{V}$ do not have a common algebraic structure, i.e., they are generic relative to each
other. Figure 1 illustrates this statement. Algebraic structures relevant in this context are arithmetic
progressions, which are sets of the form $\mathcal{S} = \{a, a + d, a + 2d, \ldots, a + (n-1)d\}$ with $a \in \mathbb{Z}$ and
$d \in \mathbb{N}$. If $\mathcal{U}$ and $\mathcal{V}$ are finite non-empty subsets of $\mathbb{Z}$, an improvement of the lower bound in (16) to
$|\mathcal{U}| + |\mathcal{V}| - 1 \le |\mathcal{U} + \mathcal{V}|$ can be obtained. This lower bound is attained if and only if $\mathcal{U}$ and $\mathcal{V}$ are
arithmetic progressions of the same step size d [9, Prop. 5.8].
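
The two extremes in (16) are easy to reproduce on small integer sets. The sets below are hypothetical examples chosen for this sketch: two arithmetic progressions with the same step size attain the improved lower bound, while two progressions with suitably different step sizes attain the upper bound.

```python
def sumset(U, V):
    """Sumset U + V = {u + v : u in U, v in V}."""
    return {u + v for u in U for v in V}

# Common algebraic structure: the same arithmetic progression twice (step 2).
ap = set(range(0, 14, 2))            # {0, 2, ..., 12}, 7 elements
small = sumset(ap, ap)               # {0, 2, ..., 24}

# No common structure: step sizes 1 and 7 make all 49 pairwise sums distinct.
U = set(range(7))                    # {0, 1, ..., 6}
V = set(range(0, 49, 7))             # {0, 7, ..., 42}
large = sumset(U, V)                 # {0, 1, ..., 48}

print(len(small), len(ap) + len(ap) - 1)   # lower bound |U| + |V| - 1 attained
print(len(large), len(U) * len(V))         # upper bound |U| * |V| attained
```

The second pair shows that what matters is structure shared by both sets: each set is an arithmetic progression on its own, yet their sumset is as large as possible.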

[Figure 1: (a) Sum of two sets with common algebraic structure. (b) Sum of two sets with different algebraic structures.]

Fig. 1: The cardinality of the sum in (a) is 19 and hence small compared to the $7^2 = 49$ pairs summed up,
whereas the sum in (b) has cardinality 49.

An interesting connection between sumset theory and entropy inequalities was discovered in [15],
[16]. This connection revolves around the fact that many sumset inequalities have analogous versions
in terms of entropy inequalities. For example, the entropy version of the trivial bounds (16) is

$$\max\{H(U), H(V)\} \le H(U + V) \le H(U) + H(V),$$

where U and V are independent discrete random variables. Less trivial examples are the sumset
inequalities [9], [17]

$$|\mathcal{U} + \mathcal{V}| \cdot |\mathcal{U}| \cdot |\mathcal{V}| \le |\mathcal{U} - \mathcal{V}|^3$$

$$|\mathcal{U} - \mathcal{V}| \le |\mathcal{U} + \mathcal{V}|^{1/2} \cdot (|\mathcal{U}| \cdot |\mathcal{V}|)^{2/3},$$

for finite non-empty sets $\mathcal{U}$, $\mathcal{V}$, with their entropy counterparts [15], [16]

$$H(U + V) + H(U) + H(V) \le 3 H(U - V) \tag{17}$$

$$H(U - V) \le \tfrac{1}{2} H(U + V) + \tfrac{2}{3} \big(H(U) + H(V)\big) \tag{18}$$

for independent discrete random variables U, V. Note that due to the logarithmic scale of entropy,
products in sumset inequalities are replaced by sums in their entropy versions.
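
The entropy version of the trivial bounds can be checked numerically for small examples. The sketch below computes the distribution of U + V for independent U and V by convolution. The helper names and the two example supports are assumptions made for this illustration.

```python
from collections import defaultdict
from math import log2

def entropy(pmf):
    """Shannon entropy in bits of a probability mass function {value: prob}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def pmf_of_sum(pmf_u, pmf_v):
    """Distribution of U + V for independent U and V."""
    out = defaultdict(float)
    for u, pu in pmf_u.items():
        for v, pv in pmf_v.items():
            out[u + v] += pu * pv
    return dict(out)

def uniform(support):
    return {x: 1 / len(support) for x in support}

U = uniform([0, 1, 2, 3])
V_same = uniform([0, 1, 2, 3])       # same arithmetic structure as U
V_generic = uniform([0, 4, 8, 12])   # all 16 sums u + v are distinct

h_same = entropy(pmf_of_sum(U, V_same))
h_generic = entropy(pmf_of_sum(U, V_generic))
print(entropy(U), h_same, h_generic)
```

With shared structure the entropy of the sum stays well below H(U) + H(V) = 4 bits, whereas in the second case the upper bound is met with equality, mirroring the behavior of sumset cardinalities.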

V. The cornerstones of the proof of Theorem 1 

In this section, we discuss the main ideas and conceptual components underlying the proof of 
Theorem 1. First, we note that, as already pointed out in Section III, by [10, Prop. 1] we have
DoF(H) ≤ K/2 for all H satisfying Condition (*). To achieve this upper bound, we construct
self-similar input distributions that yield $\mathsf{dof}(X_1, \ldots, X_K; \mathbf{H}) = K/2$ for channel matrices satisfying
Condition (*). Specifically, we take each input to have a self-similar distribution with contraction
parameter r, i.e., $X_i = \sum_{k=0}^{\infty} r^k W_{i,k}$, where, for $i = 1, \ldots, K$, $\{W_{i,k} : k \ge 0\}$ are i.i.d. copies of a
discrete random variable⁸ $W_i$ with value set $\mathcal{W}_i$, possibly different across i. For the random variables
$\sum_j h_{ij} X_j$ appearing in (11) we then have

$$\sum_j h_{ij} X_j = \sum_j \sum_{k=0}^{\infty} r^k h_{ij} W_{j,k} = \sum_{k=0}^{\infty} r^k \sum_j h_{ij} W_{j,k} \tag{19}$$

and thus $\sum_j h_{ij} X_j$ is again self-similar with contraction parameter r. The “output-$\mathcal{W}$” set, i.e., the
value set of $\sum_j h_{ij} W_j$, is then given by $\sum_j h_{ij} \mathcal{W}_j$.

⁸ Henceforth “discrete random variable” refers to a random variable that only takes finitely many values.
Next, we discuss conditions on $X_j$ and $\mathcal{W}_j$ under which analytical expressions for the information
dimension of $\sum_j h_{ij} X_j$ can be given. For general self-similar distributions arising from iterated
function systems, classical results in fractal geometry impose the so-called open set condition [18,
Thm. 2], which requires the existence of a non-empty bounded open set $\mathcal{U} \subseteq \mathbb{R}$ such that

$$\bigcup_{i=1}^{n} \varphi_{i,r}(\mathcal{U}) \subseteq \mathcal{U} \tag{20}$$

$$\text{and} \quad \varphi_{i,r}(\mathcal{U}) \cap \varphi_{j,r}(\mathcal{U}) = \emptyset, \quad \text{for all } i \ne j, \tag{21}$$

for the $\varphi_{i,r}$ defined in (12). Wu et al. [3] ensure that the open set condition is satisfied by imposing
an upper bound on the contraction parameter r according to

$$r \le \frac{m(\mathcal{W})}{m(\mathcal{W}) + M(\mathcal{W})}, \tag{22}$$

where $m(\mathcal{W}) := \min_{i \ne j} |w_i - w_j|$ and $M(\mathcal{W}) := \max_{i,j} |w_i - w_j|$. The challenge here resides
in making (22) hold for the output-$\mathcal{W}$ set. In [3] this is accomplished by building the input sets
$\mathcal{W}_i$ from $\mathbb{Z}$-linear combinations (i.e., linear combinations with integer coefficients) of monomials in
the off-diagonal channel coefficients and then recognizing that results in Diophantine approximation
theory can be used to show that (22) is satisfied for almost all channel matrices. Unfortunately, it
does not seem to be possible to obtain an explicit characterization of this “almost-all set”. Recent
groundbreaking work by Hochman [2] replaces the open set condition by a much weaker condition,
which instead of (20), (21) only requires that the IFS must not allow “exact overlap” of the images
$\varphi_{i,r}(\mathcal{A})$ and $\varphi_{j,r}(\mathcal{A})$, for $i \ne j$, which we show in Theorem 2 below can be satisfied by “wiggling”
with r in an arbitrarily small neighborhood of its original value. This improvement turns out to be
instrumental in our Theorem 1 as it allows us to abandon the Diophantine approximation approach
and thereby opens the doors to an explicit characterization of an “almost-all set” of full-DoF admitting
channel matrices. Specifically, we use the following simple consequence of [2, Thm. 1.8].

Theorem 2: If $I \subseteq (0,1)$ is a non-empty compact interval which does not consist of a single point
only, and $\mu_r$ is the self-similar distribution from (14) with contraction parameter $r \in I$ and probability
vector $(p_1, \ldots, p_n)$, then⁹

$$d(\mu_r) = \min\left\{ \frac{\sum_{i=1}^{n} p_i \log p_i}{\log r},\, 1 \right\} \tag{23}$$

for all $r \in I \setminus E$, where E is a set of Hausdorff and packing dimension zero.

Proof: For $\mathbf{i} \in \{1, \ldots, n\}^k$, let $\varphi_{\mathbf{i},r} := \varphi_{i_1,r} \circ \ldots \circ \varphi_{i_k,r}$ and define

$$\Delta_{\mathbf{i},\mathbf{j}}(r) := \varphi_{\mathbf{i},r}(0) - \varphi_{\mathbf{j},r}(0),$$

for $\mathbf{i}, \mathbf{j} \in \{1, \ldots, n\}^k$. Extend this definition to infinite sequences $\mathbf{i}, \mathbf{j} \in \{1, \ldots, n\}^{\mathbb{N}}$ according to

$$\Delta_{\mathbf{i},\mathbf{j}}(r) := \lim_{k \to \infty} \Delta_{(i_1, \ldots, i_k),(j_1, \ldots, j_k)}(r).$$

Using (12) it follows that

$$\Delta_{\mathbf{i},\mathbf{j}}(r) = \sum_{k=1}^{\infty} r^{k-1} (w_{i_k} - w_{j_k}).$$

Since a power series can vanish on a non-empty open set only if it is identically zero, we get that
$\Delta_{\mathbf{i},\mathbf{j}} \equiv 0$ on I if and only if $\mathbf{i} = \mathbf{j}$, as a consequence of the $w_i$ being pairwise distinct and I containing
a non-empty open set. This is precisely the condition of [2, Thm. 1.8] which asserts that (23) holds
for all $r \in I$ with the exception of a set of Hausdorff and packing dimension zero, and thus completes
the proof. ■

⁹ The “1” in the minimum simply accounts for the fact that information dimension cannot exceed the dimension of the
ambient space.

Remark 1: Note that (23) can be rewritten in terms of the entropy of the random variable W,
defined in (15), which takes value $w_i$ with probability $p_i$:

$$d(\mu_r) = \min\left\{ \frac{H(W)}{\log(1/r)},\, 1 \right\}. \tag{24}$$
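
The right-hand side of (24) is straightforward to evaluate numerically. The sketch below does so for a hypothetical two-point distribution with equal probabilities. It only evaluates the formula and does not check whether a given r lies in the exceptional set E of Theorem 2, so its output should be read as the value the theorem guarantees outside E. The function name is an illustrative choice.

```python
from math import log2

def info_dim_self_similar(probs, r):
    """Evaluate min{H(W) / log(1/r), 1} for probability vector `probs` and 0 < r < 1."""
    entropy = -sum(p * log2(p) for p in probs if p > 0)
    return min(entropy / log2(1 / r), 1.0)

cantor = info_dim_self_similar([0.5, 0.5], 1 / 3)   # about 0.6309
binary = info_dim_self_similar([0.5, 0.5], 1 / 2)   # ratio equals 1
capped = info_dim_self_similar([0.5, 0.5], 0.9)     # ratio exceeds 1, so the minimum is 1
print(cantor, binary, capped)
```

For r = 1/3 the value log 2 / log 3 is the familiar dimension of the Cantor distribution. For larger r the entropy term would exceed 1, and the minimum caps the result at the dimension of the ambient space.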

Remark 2: The concepts of Hausdorff and packing dimension have their roots in fractal geometry 
[14]. In the proofs of our main results, we will only need the following aspect: For I as in Theorem 2,
we can always find an $r \in I \setminus E$ for which (23) holds. This can be seen as follows: $I \setminus E = \emptyset$ implies
that E contains a non-empty open set and therefore would have Hausdorff and packing dimension 1
[14, Sec. 2.2].

Remark 3: The strength of Theorem 2 stems from (23) holding without any restrictions on the
$w_i \in \mathcal{W}$. In particular, the elements in the output-$\mathcal{W}$ set $\sum_j h_{ij} \mathcal{W}_j$ may be arbitrarily close to each
other, rendering (22), needed to satisfy the open set condition, obsolete.
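
To see how restrictive the distance constraint (22) can be, the following sketch evaluates the bound $m(\mathcal{W}) / (m(\mathcal{W}) + M(\mathcal{W}))$ for two hypothetical value sets. The function name and the sample sets are assumptions made for this illustration.

```python
from itertools import combinations

def open_set_bound(values):
    """Upper bound m / (m + M) on r from (22), for a set with at least two elements."""
    gaps = [abs(a - b) for a, b in combinations(values, 2)]
    m, M = min(gaps), max(gaps)   # smallest and largest pairwise distance
    return m / (m + M)

well_separated = open_set_bound([0.0, 2.0])        # m = M = 2
nearly_touching = open_set_bound([0.0, 1.0, 1.001])  # two elements almost coincide
print(well_separated, nearly_touching)
```

When two elements of the set nearly coincide, (22) forces r to be tiny, which in turn limits the achievable dimension in (24). Theorem 2 removes this dependence on the spacing of the $w_i$.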

We next show how Theorem 2 allows us to derive explicit expressions for the information dimension 
terms in (9). 

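Proposition 1: Let $r \in (0,1)$ and let $W_1, \ldots, W_K$ be independent discrete random variables. Then,
we have

$$\sum_{i=1}^{K} \left[ \min\left\{ \frac{H\big(\sum_{j=1}^{K} h_{ij} W_j\big)}{\log(1/r)},\, 1 \right\} - \min\left\{ \frac{H\big(\sum_{j \ne i} h_{ij} W_j\big)}{\log(1/r)},\, 1 \right\} \right] \le \mathsf{DoF}(\mathbf{H}). \tag{25}$$

Proof: For $i = 1, \ldots, K$, let $\{W_{i,k} : k \ge 0\}$ be i.i.d. copies of $W_i$. We consider the self-similar
inputs $X_i = \sum_{k=0}^{\infty} r^k W_{i,k}$, for $i = 1, \ldots, K$. Then, the signals

$$\sum_{j=1}^{K} h_{ij} X_j = \sum_{k=0}^{\infty} r^k \sum_{j=1}^{K} h_{ij} W_{j,k} \quad \text{and} \quad \sum_{j \ne i} h_{ij} X_j = \sum_{k=0}^{\infty} r^k \sum_{j \ne i} h_{ij} W_{j,k}$$

also have self-similar distributions with contraction parameter r. Thus, by Theorem 2, for each $\varepsilon > 0$,
there exists an $\tilde{r}$ in the non-empty compact interval $I_\varepsilon := [r - \varepsilon, r]$ (which does not consist of a
single point only for all $\varepsilon > 0$) such that

$$d\Big(\sum_{j=1}^{K} h_{ij} X_j\Big) = \min\left\{ \frac{H\big(\sum_{j=1}^{K} h_{ij} W_j\big)}{\log(1/\tilde{r})},\, 1 \right\} \tag{26}$$

$$\text{and} \quad d\Big(\sum_{j \ne i} h_{ij} X_j\Big) = \min\left\{ \frac{H\big(\sum_{j \ne i} h_{ij} W_j\big)}{\log(1/\tilde{r})},\, 1 \right\}. \tag{27}$$

For $\varepsilon \to 0$ we have $\log(1/\tilde{r}) \to \log(1/r)$ by continuity of $\log(\cdot)$. Thus, inserting (26) and (27) into
(10) and letting $\varepsilon \to 0$, we get (25) as desired. ■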

The freedom we exploit in constructing full DoF-achieving $X_i$ lies in the choice of $W_1, \ldots, W_K$,
which thanks to Theorem 2, unlike in [3], is not restricted by distance constraints on the output-$\mathcal{W}$
set. For simplicity of exposition, we henceforth choose the same value set $\mathcal{W}$ for each $W_i$. We want
to ensure that the first term inside the sum (9) equals 1 and the second term equals 1/2, for all i,
resulting in a total of K/2 DoF. It follows from (26), (27) that this can be accomplished by choosing
the $W_i$ such that

$$H\Big(h_{ii} W_i + \sum_{j \ne i} h_{ij} W_j\Big) \approx 2 H\Big(\sum_{j \ne i} h_{ij} W_j\Big), \tag{28}$$

followed by a suitable choice of the contraction parameter. Resorting to the analogy of entropy and
sumset cardinalities sketched in Section IV-C, the doubling condition (28) becomes

$$\Big| h_{ii} \mathcal{W} + \sum_{j \ne i} h_{ij} \mathcal{W} \Big| \approx \Big| \sum_{j \ne i} h_{ij} \mathcal{W} \Big|^2, \tag{29}$$

which effectively says that the sum of the desired signal and the interference should be twice as
“rich” as the interference alone. Note that by the trivial lower bound in (16)

$$|h_{ii} \mathcal{W}| = |\mathcal{W}| \le \Big| \sum_{j \ne i} h_{ij} \mathcal{W} \Big| \tag{30}$$

and, by the trivial upper bound in (16)

$$\Big| h_{ii} \mathcal{W} + \sum_{j \ne i} h_{ij} \mathcal{W} \Big| \le |h_{ii} \mathcal{W}| \cdot \Big| \sum_{j \ne i} h_{ij} \mathcal{W} \Big|. \tag{31}$$

The doubling condition (29) can therefore be realized by constructing $\mathcal{W}$ such that the inequalities
(30) and (31) are close to equality. In particular, this means that (cf. Section IV-C)

A) the terms in the sum $\sum_{j \ne i} h_{ij} \mathcal{W}$ must have a common algebraic structure and

B) $h_{ii} \mathcal{W}$ and $\sum_{j \ne i} h_{ij} \mathcal{W}$ must not have a common algebraic structure.

The challenge here is to introduce algebraic structure into $\mathcal{W}$ so that A) is satisfied but at the same
time to keep the algebraic structures of the sets $h_{ii} \mathcal{W}$ and $\sum_{j \ne i} h_{ij} \mathcal{W}$ different enough so that B)
is met. Before describing the specific construction of $\mathcal{W}$, we note that the answer to the question of
whether the sets $h_{ij} \mathcal{W}$ have a common algebraic structure or not depends on the channel coefficients
$h_{ij}$. As we want our construction to be universal in the sense of (29) holding independently of the
channel coefficients, a channel-independent choice of $\mathcal{W}$ is out of the question. Inspired by [6], we
build $\mathcal{W}$ as a set of $\mathbb{Z}$-linear combinations of monomials (up to a certain degree $d \in \mathbb{N}$) in the
off-diagonal channel coefficients, i.e., the elements of $\mathcal{W}$ are given by $\sum_{j=1}^{\varphi(d)} a_j f_j(\mathbf{h})$, for $a_j \in \{1, \ldots, N\}$
with $N \in \mathbb{N}$. This construction satisfies A) by inducing the same algebraic structure for $h_{ij} \mathcal{W}$, $j \ne i$,
independently of the actual values of the channel coefficients $h_{ij}$, $j \ne i$. To see this, first note that
multiplying the elements $\sum_{j=1}^{\varphi(d)} a_j f_j(\mathbf{h})$ of $\mathcal{W}$ by an off-diagonal channel coefficient $h_{ij}$, $j \ne i$, simply
increases the degrees of the participating $f_j(\mathbf{h})$ by 1. For d sufficiently large the number of elements
that do not appear both in $h_{ij} \mathcal{W}$ and $\mathcal{W}$ is therefore small, rendering $h_{ij} \mathcal{W}$, $j \ne i$, algebraically
“similar” to $\mathcal{W}$, which we denote as $h_{ij} \mathcal{W} \sim \mathcal{W}$. We therefore get $\sum_{j \ne i} h_{ij} \mathcal{W} \sim \mathcal{W} + \ldots + \mathcal{W}$ as
the sum of K − 1 sets with shared algebraic structure and note that the elements of $\mathcal{W} + \ldots + \mathcal{W}$
are given by $\sum_{j=1}^{\varphi(d)} a_j f_j(\mathbf{h})$ with $a_j \in \{1, \ldots, (K-1)N\}$. Choosing N to be large relative to K, we
finally get $|\sum_{j \ne i} h_{ij} \mathcal{W}| \approx |\mathcal{W}|$. As for Condition B), we begin by noting that $h_{ii}$ does not participate
in the monomials $f_j(\mathbf{h})$ used to construct the elements in $\mathcal{W}$. This means that $\sum_{j \ne i} h_{ij} \mathcal{W}$ consists
of $\mathbb{Z}$-linear combinations of $f_j(\mathbf{h})$, while $h_{ii} \mathcal{W}$ consists of $\mathbb{Z}$-linear combinations of $h_{ii} f_j(\mathbf{h})$. By
Condition (*) the union of the sets $\{f_j(\mathbf{h}) : j \ge 1\}$ and $\{h_{ii} f_j(\mathbf{h}) : j \ge 1\}$ is linearly independent
over $\mathbb{Q}$, which ensures that $h_{ii} \mathcal{W}$ and $\sum_{j \ne i} h_{ij} \mathcal{W}$ do not share an algebraic structure.
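
The heuristic that the interference set is only marginally larger than $\mathcal{W}$ can be quantified on a logarithmic scale. Section VI bounds the cardinality of the interference set by $((K-1)N)^{\varphi(d+1)}$ and uses $|\mathcal{W}| = N^{\varphi(d)}$. The sketch below evaluates the ratio of the two exponents for sample parameters. The function names and the parameter values are illustrative assumptions, and N ≥ 2 is required to avoid division by zero.

```python
from math import comb, log2

def phi(K, d):
    """Number of monomials of degree <= d in K(K-1) variables."""
    return comb(K * (K - 1) + d, d)

def log_cardinality_ratio(K, d, N):
    """phi(d+1) * log((K-1)N) divided by phi(d) * log N, for N >= 2."""
    return (phi(K, d + 1) / phi(K, d)) * log2((K - 1) * N) / log2(N)

coarse = log_cardinality_ratio(3, 1, 2)            # small d and N: bound is loose
fine = log_cardinality_ratio(3, 10**4, 2**200)     # large d and N: ratio near 1
print(coarse, fine)
```

The ratio approaches 1 only when both d and N grow, matching the two requirements in the text: d large so that raising degrees by 1 adds few new monomials, and N large relative to K so that the factor K − 1 is negligible on the logarithmic scale.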


VI. Proof of Theorem 1 


Since a set containing 0 is always linearly dependent over $\mathbb{Q}$, Condition (*) implies that all entries
of H must be nonzero, i.e., H must be fully connected. It therefore follows from [10, Prop. 1] that
DoF(H) ≤ K/2.

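The remainder of the proof establishes the lower bound DoF(H) ≥ K/2 under Condition (*). Let
N and d be positive integers. We begin by setting

$$\mathcal{W}_N := \Big\{ \sum_{i=1}^{\varphi(d)} a_i f_i(\mathbf{h}) : a_1, \ldots, a_{\varphi(d)} \in \{1, \ldots, N\} \Big\} \tag{32}$$

and $r := |\mathcal{W}_N|^{-2}$. Let $W_1, \ldots, W_K$ be i.i.d. uniform random variables on $\mathcal{W}_N$. By Proposition 1 we
then have

$$\sum_{i=1}^{K} \left[ \min\left\{ \frac{H\big(\sum_{j=1}^{K} h_{ij} W_j\big)}{2 \log |\mathcal{W}_N|},\, 1 \right\} - \min\left\{ \frac{H\big(\sum_{j \ne i} h_{ij} W_j\big)}{2 \log |\mathcal{W}_N|},\, 1 \right\} \right] \le \mathsf{DoF}(\mathbf{H}). \tag{33}$$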
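Note that the random variable $\sum_{j \ne i} h_{ij} W_j$ takes value in

$$\Big\{ \sum_{i=1}^{\varphi(d+1)} a_i f_i(\mathbf{h}) : a_1, \ldots, a_{\varphi(d+1)} \in \{1, \ldots, (K-1)N\} \Big\}. \tag{34}$$

By Condition (*) the set $\{f_j(\mathbf{h}) : j \ge 1\}$ is linearly independent over $\mathbb{Q}$. Therefore, each element in
the set (34) has exactly one representation as a $\mathbb{Z}$-linear combination with coefficients $a_1, \ldots, a_{\varphi(d+1)} \in
\{1, \ldots, (K-1)N\}$. This allows us to conclude that the cardinality of the set (34) is given by
$((K-1)N)^{\varphi(d+1)}$, which implies $H\big(\sum_{j \ne i} h_{ij} W_j\big) \le \varphi(d+1) \log((K-1)N)$. Similarly, we find that
$|\mathcal{W}_N| = N^{\varphi(d)}$ and thus get

$$\begin{aligned}
\frac{H\big(\sum_{j \ne i} h_{ij} W_j\big)}{2 \log |\mathcal{W}_N|} &\le \frac{\varphi(d+1) \log((K-1)N)}{2 \varphi(d) \log N} && (35) \\
&\xrightarrow{\; d, N \to \infty \;} \frac{1}{2}, && (36)
\end{aligned}$$

where we used

$$\frac{\varphi(d+1)}{\varphi(d)} = \frac{K(K-1) + d + 1}{d + 1} \xrightarrow{\; d \to \infty \;} 1. \tag{37}$$

We next show that Condition (*) implies that

$$H\Big(h_{ii} W_i + \sum_{j \ne i} h_{ij} W_j\Big) = H\Big(h_{ii} W_i, \sum_{j \ne i} h_{ij} W_j\Big). \tag{38}$$
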

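Applying the chain rule twice we find

$$\begin{aligned}
H\Big(h_{ii} W_i, \sum_{j \ne i} h_{ij} W_j\Big) &= H\Big(h_{ii} W_i, \sum_{j \ne i} h_{ij} W_j,\; h_{ii} W_i + \sum_{j \ne i} h_{ij} W_j\Big) && (39) \\
&= H\Big(h_{ii} W_i + \sum_{j \ne i} h_{ij} W_j\Big) + H\Big(h_{ii} W_i, \sum_{j \ne i} h_{ij} W_j \,\Big|\, h_{ii} W_i + \sum_{j \ne i} h_{ij} W_j\Big), && (40)
\end{aligned}$$

and therefore proving (38) amounts to showing that

$$H\Big(h_{ii} W_i, \sum_{j \ne i} h_{ij} W_j \,\Big|\, h_{ii} W_i + \sum_{j \ne i} h_{ij} W_j\Big) = 0. \tag{41}$$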


In order to establish (41), suppose that $w_1, \ldots, w_K$ and $\tilde{w}_1, \ldots, \tilde{w}_K$ are realizations of $W_1, \ldots, W_K$ such
that

$$h_{ii} w_i + \sum_{j \neq i} h_{ij} w_j = h_{ii} \tilde{w}_i + \sum_{j \neq i} h_{ij} \tilde{w}_j \tag{42}$$

or equivalently

$$h_{ii}(w_i - \tilde{w}_i) + \sum_{j \neq i} h_{ij}(w_j - \tilde{w}_j) = 0. \tag{43}$$

The first term on the left-hand side (LHS) of (43) is a $\mathbb{Z}$-linear combination of elements in $\{h_{ii} f_j(\check{h}) : j \geq 1\}$,
whereas the second term is a $\mathbb{Z}$-linear combination of elements in $\{f_j(\check{h}) : j \geq 1\}$. Thanks
to the linear independence of the union in Condition (*), it follows that the two terms in (43) have
to equal zero individually and hence $w_i = \tilde{w}_i$ and $\sum_{j \neq i} h_{ij} w_j = \sum_{j \neq i} h_{ij} \tilde{w}_j$. This shows that the
sum $h_{ii} W_i + \sum_{j \neq i} h_{ij} W_j$ uniquely determines the terms $h_{ii} W_i$ and $\sum_{j \neq i} h_{ij} W_j$ and therefore proves
(41). Next, we note that


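$$H\Big(\sum_{j=1}^{K} h_{ij} W_j\Big) = H\Big(h_{ii} W_i + \sum_{j \neq i} h_{ij} W_j\Big) \tag{44}$$
$$= H\Big(h_{ii} W_i, \sum_{j \neq i} h_{ij} W_j\Big) \tag{45}$$
$$= H(h_{ii} W_i) + H\Big(\sum_{j \neq i} h_{ij} W_j\Big), \tag{46}$$

where the last equality is thanks to the independence of the $W_j$, $1 \leq j \leq K$. Putting the pieces
together, we finally obtain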


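$$\frac{H\big(\sum_{j=1}^{K} h_{ij} W_j\big) - H\big(\sum_{j \neq i} h_{ij} W_j\big)}{2 \log |\mathcal{W}_N|} = \frac{H(h_{ii} W_i)}{2 \varphi(d) \log N} \tag{47}$$
$$= \frac{\varphi(d) \log N}{2 \varphi(d) \log N} = \frac{1}{2}, \tag{48}$$

where we used the scaling invariance of entropy, the fact that $W_i$ is uniform on $\mathcal{W}_N$, and $|\mathcal{W}_N| = N^{\varphi(d)}$.
This allows us to conclude that, for all $d$ and $N$, we have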


$$\min\Big\{ \frac{H\big(\sum_{j=1}^{K} h_{ij} W_j\big)}{2 \log |\mathcal{W}_N|}, 1 \Big\} - \min\Big\{ \frac{H\big(\sum_{j \neq i} h_{ij} W_j\big)}{2 \log |\mathcal{W}_N|}, 1 \Big\} \geq 1 - \frac{\varphi(d+1) \log((K-1)N)}{2 \varphi(d) \log N} \tag{49}$$

as either the first minimum on the LHS of (49) coincides with the non-trivial term, in which case by
(46) the second minimum coincides with the non-trivial term as well, and therefore by (48) the LHS
of (49) equals $\frac{1}{2} \geq 1 - \frac{\varphi(d+1) \log((K-1)N)}{2 \varphi(d) \log N}$, or the first minimum coincides with 1, in which case
we apply

$$\min\Big\{ \frac{H\big(\sum_{j \neq i} h_{ij} W_j\big)}{2 \log |\mathcal{W}_N|}, 1 \Big\} \leq \frac{H\big(\sum_{j \neq i} h_{ij} W_j\big)}{2 \log |\mathcal{W}_N|} \leq \frac{\varphi(d+1) \log((K-1)N)}{2 \varphi(d) \log N},$$

where we used (35) for the second inequality. As, by (36), the RHS of (49) converges to 1/2 for $d, N \to \infty$, it follows that the
LHS of (33) is asymptotically lower-bounded by K/2. This completes the proof. ■


VII. Non-asymptotic statement 

Given a channel matrix H, verifying Condition (*) in theory requires checking infinitely many
equations of the form (4). It is therefore natural to ask whether we can say anything about the DoF
achievable for a given H when (4) is known to hold only for finitely many coefficients $a_j$, $b_j$ and up to
a finite degree $d$. To address this question we consider the same input distributions as in the proof of
Theorem 1 and carefully analyze the steps in the proof that employ Condition (*). Specifically, there




are only two such steps, namely the argument on the uniqueness of the representation of elements in
the set (34) and the argument leading to (46). First, as to uniqueness in (34) we need to verify that

$$\sum_{j=1}^{\varphi(d+1)} a_j f_j(\check{h}) \neq \sum_{j=1}^{\varphi(d+1)} \tilde{a}_j f_j(\check{h}) \tag{50}$$

for all $a_j, \tilde{a}_j \in \{1, \ldots, (K-1)N\}$ with $(a_1, \ldots, a_{\varphi(d+1)}) \neq (\tilde{a}_1, \ldots, \tilde{a}_{\varphi(d+1)})$. Note that we have to
consider monomials up to degree $d + 1$, as the multiplication of $W_j$ by an off-diagonal channel
coefficient $h_{ij}$ increases the degrees of the involved monomials by 1, as already formalized in (34).
Second, to get (46), we need to ensure that $h_{ii} W_i + \sum_{j \neq i} h_{ij} W_j$ uniquely determines $h_{ii} W_i$ and
$\sum_{j \neq i} h_{ij} W_j$, for $i = 1, \ldots, K$, which amounts to requiring $h_{ii} w_i + \sum_{j \neq i} h_{ij} w_j \neq h_{ii} \tilde{w}_i + \sum_{j \neq i} h_{ij} \tilde{w}_j$
whenever $(h_{ii} w_i, \sum_{j \neq i} h_{ij} w_j) \neq (h_{ii} \tilde{w}_i, \sum_{j \neq i} h_{ij} \tilde{w}_j)$. Inserting the elements in (32) for $w_i, \tilde{w}_i$ this
condition reads

$$\sum_{j=1}^{\varphi(d+1)} a_j f_j(\check{h}) + \sum_{j=1}^{\varphi(d)} b_j h_{ii} f_j(\check{h}) \neq \sum_{j=1}^{\varphi(d+1)} \tilde{a}_j f_j(\check{h}) + \sum_{j=1}^{\varphi(d)} \tilde{b}_j h_{ii} f_j(\check{h}), \tag{51}$$

for all $a_j, \tilde{a}_j \in \{1, \ldots, (K-1)N\}$ and $b_j, \tilde{b}_j \in \{1, \ldots, N\}$ with

$$(a_1, \ldots, a_{\varphi(d+1)}, b_1, \ldots, b_{\varphi(d)}) \neq (\tilde{a}_1, \ldots, \tilde{a}_{\varphi(d+1)}, \tilde{b}_1, \ldots, \tilde{b}_{\varphi(d)}).$$

Note that (50) is a special case of (51) obtained by setting $b_j = \tilde{b}_j$, for all $j$, in (51). Finally, rearranging
terms we find that (51) simply says that non-trivial $\mathbb{Z}$-linear combinations of the elements participating
in Condition (*) do not equal zero, which in turn is equivalent to (4) restricted to a finite number of
coefficients and a finite degree.

Now, assuming that, for a given H, (51) is verified for all $a_j, \tilde{a}_j, b_j, \tilde{b}_j$ and fixed $d$ and $N$, we can
proceed as in the proof of Theorem 1 to get the following from (49):

$$\min\Big\{ \frac{H\big(\sum_{j=1}^{K} h_{ij} W_j\big)}{\log(1/r)}, 1 \Big\} - \min\Big\{ \frac{H\big(\sum_{j \neq i} h_{ij} W_j\big)}{\log(1/r)}, 1 \Big\} \geq 1 - \frac{\varphi(d+1) \log((K-1)N)}{2 \varphi(d) \log N}$$
$$= 1 - \frac{(K(K-1) + d + 1) \log((K-1)N)}{2 (d+1) \log N}.$$

Upon insertion into (33) this yields the DoF lower bound

$$\mathrm{DoF}(\mathbf{H}) \geq \frac{K}{2} \Big[ 2 - \frac{(K(K-1) + d + 1) \log((K-1)N)}{(d+1) \log N} \Big].$$
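To get a feeling for how quickly this non-asymptotic bound approaches K/2, the following minimal Python sketch evaluates it for a few values of d and N. The function name `dof_lower_bound` and the choice to pass log2(N) directly (so that very large N can be used) are illustrative assumptions, not part of the original development. Note that the bound is only informative when d and N are large; for small values it can fall well below trivial bounds.

```python
import math


def dof_lower_bound(K: int, d: int, log2_N: float) -> float:
    """Evaluate (K/2) * [2 - (K(K-1)+d+1) log((K-1)N) / ((d+1) log N)].

    K       number of users (K >= 2)
    d       maximum monomial degree for which (51) has been verified
    log2_N  base-2 logarithm of N (hypothetical parametrization, allows huge N)
    """
    if K < 2 or d < 1 or log2_N <= 0:
        raise ValueError("need K >= 2, d >= 1 and N > 1")
    # phi(d+1)/phi(d) = (K(K-1)+d+1)/(d+1), cf. (37)
    ratio_phi = (K * (K - 1) + d + 1) / (d + 1)
    # log((K-1)N)/log N; the base of the logarithm cancels in the ratio
    ratio_log = (math.log2(K - 1) + log2_N) / log2_N
    return (K / 2) * (2 - ratio_phi * ratio_log)


if __name__ == "__main__":
    for d, log2_N in [(10, 10.0), (100, 100.0), (10**4, 1e4), (10**6, 1e6)]:
        print(d, log2_N, dof_lower_bound(3, d, log2_N))
```

For K = 3 the printed values increase toward 3/2 as d and log N grow, in line with the asymptotic statement of Theorem 1.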


VIII. Condition (*) is not necessary 

While Condition (*) is sufficient for DoF(H) = K/2, we next show that it is not necessary.
This will be accomplished by constructing a class of example channel matrices that fail to satisfy
Condition (*) but still admit K/2 DoF. As, however, almost all channel matrices satisfy Condition (*),
this example class is necessarily of Lebesgue measure zero. Specifically, we consider channel matrices
that have $h_{ii} \in \mathbb{R} \setminus \mathbb{Q}$, $i = 1, \ldots, K$, and $h_{ij} \in \mathbb{Q} \setminus \{0\}$, for $i, j = 1, \ldots, K$ with $i \neq j$. This
assumption implies that all entries of H are nonzero, i.e., H is fully connected, which, again by [10,
Prop. 1], yields DoF(H) ≤ K/2. Moreover, as two rational numbers are linearly dependent over $\mathbb{Q}$,
these channel matrices violate Condition (*). We next show that nevertheless DoF(H) ≥ K/2 and
hence DoF(H) = K/2. This will be accomplished by constructing corresponding DoF-optimal input
distributions.

We begin by arguing that we may assume $h_{ij} \in \mathbb{Z}$, for $i \neq j$. Indeed, since DoF(H) is invariant
to scaling of rows or columns of H by a nonzero constant [12, Lem. 3], we can, without affecting
DoF(H), multiply the channel matrix by a common denominator of the $h_{ij}$, $i \neq j$, thus rendering
the off-diagonal entries integer-valued while retaining irrationality of the diagonal entries.

Let

$$\mathcal{W} := \{0, \ldots, N-1\} \tag{52}$$

for some $N > 0$, and take $W_1, \ldots, W_K$ to be i.i.d. uniformly distributed on $\mathcal{W}$. We set the contraction
parameter to

$$r := 2^{-2 \log(2 h_{\max} K N)} \tag{53}$$

where $h_{\max} := \max\{|h_{ij}| : i \neq j\}$. Writing $\sum_{j=1}^{K} h_{ij} W_j = h_{ii} \cdot W_i + 1 \cdot \sum_{j \neq i} h_{ij} W_j$, where
$W_i, \sum_{j \neq i} h_{ij} W_j \in \mathbb{Z}$, and realizing that $\{h_{ii}, 1\}$ is linearly independent over $\mathbb{Q}$, we can mimic the
arguments leading to (46) to conclude that

$$H\Big(\sum_{j=1}^{K} h_{ij} W_j\Big) = H(h_{ii} W_i) + H\Big(\sum_{j \neq i} h_{ij} W_j\Big), \tag{54}$$

for $i = 1, \ldots, K$. In fact, it is precisely the linear independence of $\{h_{ii}, 1\}$ over $\mathbb{Q}$ that makes this
example class work. Next, we note that

$$\sum_{j \neq i} h_{ij} W_j \in \{-h_{\max}(K-1)N, \ldots, 0, \ldots, h_{\max}(K-1)N\}$$

and hence $H\big(\sum_{j \neq i} h_{ij} W_j\big) \leq \log(2 h_{\max} K N)$. Since the $W_i$, $1 \leq i \leq K$, are identically distributed,
we have $H(h_{ii} W_i) = H(h_{ij} W_j)$, for all $i, j$, and therefore $H(h_{ii} W_i) \leq H\big(\sum_{j \neq i} h_{ij} W_j\big)$ as a
consequence of the fact that the entropy of a sum of independent random variables is at least as large as
the entropy of each participating random variable [19, Ex. 2.14]. Thus (54) implies that

$$H\Big(\sum_{j=1}^{K} h_{ij} W_j\Big) \leq 2 H\Big(\sum_{j \neq i} h_{ij} W_j\Big) \leq 2 \log(2 h_{\max} K N).$$
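With (53) we therefore obtain

$$\min\Big\{ \frac{H\big(\sum_{j=1}^{K} h_{ij} W_j\big)}{\log(1/r)}, 1 \Big\} = \frac{H\big(\sum_{j=1}^{K} h_{ij} W_j\big)}{\log(1/r)}$$

and since

$$H\Big(\sum_{j \neq i} h_{ij} W_j\Big) \leq H\Big(\sum_{j=1}^{K} h_{ij} W_j\Big) \tag{55}$$

again by [19, Ex. 2.14], we also have

$$\min\Big\{ \frac{H\big(\sum_{j \neq i} h_{ij} W_j\big)}{\log(1/r)}, 1 \Big\} = \frac{H\big(\sum_{j \neq i} h_{ij} W_j\big)}{\log(1/r)}.$$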

Applying Proposition 1 with (54) and using $H(h_{ii} W_i) = \log N$, we finally obtain

$$\mathrm{DoF}(\mathbf{H}) \geq \frac{\sum_{i=1}^{K} H(h_{ii} W_i)}{\log(1/r)} = \frac{K \log N}{\log(1/r)} = \frac{K \log N}{2 \log(2 h_{\max} K N)}. \tag{56}$$

Since (56) holds for all $N$, in particular for $N \to \infty$, this establishes that DoF(H) ≥ K/2 and
thereby completes our argument.
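The convergence of the right-hand side of (56) to K/2 is slow, since the gap decays only like 1/log N. The following short Python sketch illustrates this numerically; the function name `dof_bound_56` and the example values K = 3 and h_max = 5 are illustrative assumptions.

```python
import math


def dof_bound_56(K: int, h_max: int, N: int) -> float:
    """Right-hand side of (56): K log N / (2 log(2 h_max K N))."""
    if K < 2 or h_max < 1 or N < 2:
        raise ValueError("need K >= 2, h_max >= 1, N >= 2")
    # The ratio of logarithms does not depend on the base.
    return K * math.log2(N) / (2 * math.log2(2 * h_max * K * N))


if __name__ == "__main__":
    K, h_max = 3, 5  # hypothetical example values
    for exponent in (10, 50, 200, 1000):
        N = 2 ** exponent
        print(f"N = 2^{exponent}: bound = {dof_bound_56(K, h_max, N):.5f}")
```

Every printed value stays strictly below K/2 = 1.5 and increases with N, which is all the argument needs: the supremum over N equals K/2.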

Recall that in the case of channel matrices satisfying Condition (*) the value set $\mathcal{W}_N$ in (32) is
channel-dependent. Here, however, the assumption of the diagonal entries of H being irrational and 
the off-diagonal entries rational already induces enough algebraic structure for our arguments to 
work. In the case of channel matrices satisfying Condition (*) we induce an algebraic structure that 
is shared by all participating channel matrices through the choice of the channel-dependent set $\mathcal{W}_N$
and by enforcing Condition (*). We conclude by noting that the example class studied here was 
investigated before in [7, Thm. 1] and [3, Thm. 6]. In contrast to [3], [7] our proof of DoF-optimality 
is, however, not based on arguments from Diophantine approximation theory. 


IX. DoF-characterization in terms of Shannon entropy 


To put our second main result, reported in this section, into context, we first note that the DoF- 
characterization [3, Thm. 4], see also (11) and the statement thereafter, is in terms of information 
dimension. As already noted, information dimension is, in general, difficult to evaluate. Now, it 
turns out that the DoF-lower bound in Proposition 1 can be developed into a full-fledged DoF- 
characterization in the spirit of [3, Thm. 4], which, however, will be entirely in terms of Shannon 
entropies. 


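Theorem 3: Achievability: For all channel matrices H, we have

$$\sup_{W_1, \ldots, W_K} \frac{\sum_{i=1}^{K} \Big[ H\big(\sum_{j=1}^{K} h_{ij} W_j\big) - H\big(\sum_{j \neq i} h_{ij} W_j\big) \Big]}{\max_{i=1,\ldots,K} H\big(\sum_{j=1}^{K} h_{ij} W_j\big)} \leq \mathrm{DoF}(\mathbf{H}) \tag{57}$$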




where the supremum in (57) is taken over all independent discrete $W_1, \ldots, W_K$ such that the
denominator in (57) is nonzero.^{10}

Converse : We have equality in (57) for almost all H including channel matrices with all off-diagonal 
entries algebraic numbers and arbitrary diagonal entries. 

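Proof: We begin with the proof of the achievability statement. The idea of the proof is to
apply Proposition 1 with a suitably chosen contraction parameter $r$. Specifically, let $W_1, \ldots, W_K$
be independent discrete random variables such that the denominator in (57) is nonzero, and apply
Proposition 1 with

$$r := 2^{-\max_{i=1,\ldots,K} H\left(\sum_{j=1}^{K} h_{ij} W_j\right)},$$

which ensures that all minima in (25) coincide with the respective non-trivial terms. Specifically, for
$i = 1, \ldots, K$, we have

$$\min\Big\{ \frac{H\big(\sum_{j=1}^{K} h_{ij} W_j\big)}{\log(1/r)}, 1 \Big\} = \frac{H\big(\sum_{j=1}^{K} h_{ij} W_j\big)}{\max_{i=1,\ldots,K} H\big(\sum_{j=1}^{K} h_{ij} W_j\big)}$$

and

$$\min\Big\{ \frac{H\big(\sum_{j \neq i} h_{ij} W_j\big)}{\log(1/r)}, 1 \Big\} = \frac{H\big(\sum_{j \neq i} h_{ij} W_j\big)}{\max_{i=1,\ldots,K} H\big(\sum_{j=1}^{K} h_{ij} W_j\big)},$$

where the latter follows from $H\big(\sum_{j \neq i} h_{ij} W_j\big) \leq H\big(\sum_{j=1}^{K} h_{ij} W_j\big)$ (cf. (55)). Proposition 1 now
yields

$$\frac{\sum_{i=1}^{K} \Big[ H\big(\sum_{j=1}^{K} h_{ij} W_j\big) - H\big(\sum_{j \neq i} h_{ij} W_j\big) \Big]}{\max_{i=1,\ldots,K} H\big(\sum_{j=1}^{K} h_{ij} W_j\big)} \leq \mathrm{DoF}(\mathbf{H}). \tag{58}$$

Finally, the inequality (57) is obtained by supremization of the LHS of (58) over all admissible
$W_1, \ldots, W_K$.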


To prove the converse, we begin by referring to the proof of [3, Thm. 4], where the following is
shown to hold for almost all H including channel matrices H with all off-diagonal entries algebraic
numbers and arbitrary diagonal entries: For every $\delta > 0$, there exist independent discrete random
variables $W_1, \ldots, W_K$ and an $r \in (0, 1)$ satisfying^{11}

$$\log(1/r) \geq \max_{i=1,\ldots,K} H\Big(\sum_{j=1}^{K} h_{ij} W_j\Big) \tag{59}$$

^{10} This condition only excludes the cases where all $W_j$ that appear with nonzero channel coefficients are chosen as
deterministic. In fact, such choices yield $\mathrm{dof}(X_1, \ldots, X_K; \mathbf{H}) = 0$ (irrespective of the choice of the contraction parameter
$r$) and are thus not of interest.

^{11} This statement is obtained from the proof of [3, Thm. 4] as follows. The $W_i$ and $r$ here correspond to the Wi and r"
defined in [3, Eq. (146)] and [3, Eq. (147)], respectively. The relation in (59) is then simply a consequence of [3, Eq. (153)]
and the cardinality bound for entropy.
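such that

$$\mathrm{DoF}(\mathbf{H}) \leq \delta + \frac{\sum_{i=1}^{K} \Big[ H\big(\sum_{j=1}^{K} h_{ij} W_j\big) - H\big(\sum_{j \neq i} h_{ij} W_j\big) \Big]}{\log(1/r)}. \tag{60}$$

By (59) it follows that

$$\frac{\sum_{i=1}^{K} \Big[ H\big(\sum_{j=1}^{K} h_{ij} W_j\big) - H\big(\sum_{j \neq i} h_{ij} W_j\big) \Big]}{\log(1/r)} \leq \frac{\sum_{i=1}^{K} \Big[ H\big(\sum_{j=1}^{K} h_{ij} W_j\big) - H\big(\sum_{j \neq i} h_{ij} W_j\big) \Big]}{\max_{i=1,\ldots,K} H\big(\sum_{j=1}^{K} h_{ij} W_j\big)}.$$

Finally, letting $\delta \to 0$ and taking the supremum over all admissible $W_1, \ldots, W_K$, we get

$$\mathrm{DoF}(\mathbf{H}) \leq \sup_{W_1, \ldots, W_K} \frac{\sum_{i=1}^{K} \Big[ H\big(\sum_{j=1}^{K} h_{ij} W_j\big) - H\big(\sum_{j \neq i} h_{ij} W_j\big) \Big]}{\max_{i=1,\ldots,K} H\big(\sum_{j=1}^{K} h_{ij} W_j\big)},$$

for almost all H including channel matrices H with all off-diagonal entries algebraic numbers and
arbitrary diagonal entries. This completes the proof. ■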

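Remark 4: In the achievability part of Theorem 3, we have actually shown that for all H

$$\sup_{W_1, \ldots, W_K} \frac{\sum_{i=1}^{K} \Big[ H\big(\sum_{j=1}^{K} h_{ij} W_j\big) - H\big(\sum_{j \neq i} h_{ij} W_j\big) \Big]}{\max_{i=1,\ldots,K} H\big(\sum_{j=1}^{K} h_{ij} W_j\big)} \leq \sup_{X_1, \ldots, X_K} \sum_{i=1}^{K} \Big[ d\Big(\sum_{j=1}^{K} h_{ij} X_j\Big) - d\Big(\sum_{j \neq i} h_{ij} X_j\Big) \Big], \tag{61}$$

which combined with (11) yields (57). The LHS of (61) is obtained by reasoning along the same
lines as in the proof of Proposition 1, namely by applying the RHS of (61) to self-similar $X_1, \ldots, X_K$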
with suitable contraction parameter r, invoking Theorem 2, and noting that the supremization is then 
carried out over a smaller set of distributions. By Theorem 3 we know that our alternative DoF- 
characterization is equivalent to the original DoF-characterization in [3, Thm. 4], i.e., (61) holds with 
equality, for almost all H including H-matrices with all off-diagonal entries algebraic numbers and 
arbitrary diagonal entries, since in all these cases we have a converse for both DoF-characterizations. 
As shown in the next section, this includes cases where DoF(H) < K/2. Moreover, the two DoF- 
characterizations are equivalent on the “almost-all set” characterized by Condition (*), as in this case
the LHS of (61) equals K/2 and therefore by (11) and DoF(H) ≤ K/2 [10, Prop. 1], we get that
the RHS of (61) equals K/2 as well. What we do not know is whether (61) is always satisfied with 
equality, but certainly the set of channel matrices where this is not the case is of Lebesgue measure 
zero. 

Remark 5: Compared to the original DoF-characterization [3, Thm. 4] the alternative expression in 
Theorem 3 exhibits two advantages. First, the supremization has to be carried out over discrete random
variables only, whereas in [3, Thm. 4] the supremum is taken over general input distributions. Second, 
Shannon entropy is typically much easier to evaluate than information dimension. Our alternative 
characterization is therefore more amenable to both analytical statements and numerical evaluations. 
This is demonstrated in the next section, where we put the new DoF-characterization to work to
explain why determining the exact number of DoF for channel matrices with rational entries has 
remained elusive so far, even for simple examples. In addition, we will exemplify the quantitative 
applicability of our DoF-formula by improving upon the best-known bounds on the DoF of a particular 
channel matrix studied in [3]. 
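As an illustration of the numerical tractability claimed in Remark 5, the following minimal Python sketch evaluates the ratio inside the supremum in (57) for a given channel matrix and given finitely supported input distributions. It is a sketch under stated assumptions: channel coefficients and support points must be integers or `Fraction`s (irrational entries cannot be represented exactly this way), and the helper names `entropy`, `pmf_of_sum` and `entropy_ratio` are hypothetical. By Theorem 3, any value returned is a lower bound on DoF(H).

```python
from fractions import Fraction
from math import log2


def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {value: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


def pmf_of_sum(coeffs, pmfs):
    """pmf of sum_j coeffs[j] * W_j for independent W_j with pmfs[j]."""
    out = {Fraction(0): 1.0}
    for c, pmf in zip(coeffs, pmfs):
        if c == 0:
            continue  # W_j does not contribute to this linear combination
        nxt = {}
        for s, ps in out.items():
            for w, pw in pmf.items():
                key = s + Fraction(c) * Fraction(w)  # exact arithmetic
                nxt[key] = nxt.get(key, 0.0) + ps * pw
        out = nxt
    return out


def entropy_ratio(H, pmfs):
    """Ratio inside the supremum in (57) for channel matrix H (list of rows)."""
    K = len(H)
    full = [entropy(pmf_of_sum(H[i], pmfs)) for i in range(K)]
    interference = [
        entropy(pmf_of_sum([0 if j == i else H[i][j] for j in range(K)], pmfs))
        for i in range(K)
    ]
    denom = max(full)
    if denom == 0:
        raise ValueError("denominator in (57) is zero (deterministic inputs)")
    return sum(f - g for f, g in zip(full, interference)) / denom


if __name__ == "__main__":
    bit = {0: 0.5, 1: 0.5}
    print(entropy_ratio([[1, 0], [0, 1]], [bit, bit]))  # no interference
    print(entropy_ratio([[1, 1], [1, 1]], [bit, bit]))  # all-ones 2-user IC
```

A search over input distributions (for instance by random restarts or gradient-free optimization over the probability vectors) can then be layered on top of `entropy_ratio` to tighten the lower bound.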


X. DoF characterization and additive combinatorics


In this section, we apply our alternative DoF-characterization in Theorem 3 to establish a formal 
connection between the characterization of DoF for arbitrary channel matrices and sumset problems 
in additive combinatorics. We also show how Theorem 3 can be used to improve the best known 
bounds on the DoF of a particular channel matrix studied in [3]. 

We begin by noting that according to [7, Thm. 2] channel matrices with all entries rational admit 
strictly less than K/2 DoF, i.e., 

DoF(H) < K/2.


However, finding the exact number of DoF for rational H, even for simple examples, turns out to 
be a very difficult problem. Based on our alternative DoF-characterization (57) in Theorem 3, which 
here holds with equality as all entries of H are rational, we will be able to explain why this problem
is so difficult. Specifically, we establish that characterizing the DoF for H with all entries rational 
is equivalent to solving very hard problems in sumset theory. As noted before, however, finding the 
exact number of DoF is difficult only on a set of channel matrices of Lebesgue measure zero, since 
DoF(H) = K/2 for almost all H. 

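The simplest non-trivial example is the 3-user case with

$$\mathbf{H} = \begin{pmatrix} h_1 & 0 & 0 \\ h_2 & h_3 & 0 \\ h_4 & h_5 & h_6 \end{pmatrix}$$

where $h_1, \ldots, h_6 \in \mathbb{Q} \setminus \{0\}$. Since DoF(H) is invariant to scaling of rows or columns of H by a
nonzero constant [12, Lem. 3], we can transform this channel matrix as follows:

$$\begin{pmatrix} h_1 & 0 & 0 \\ h_2 & h_3 & 0 \\ h_4 & h_5 & h_6 \end{pmatrix} \to \begin{pmatrix} 1 & 0 & 0 \\ h_2 & h_3 & 0 \\ 1 & \frac{h_5}{h_4} & \frac{h_6}{h_4} \end{pmatrix} \to \begin{pmatrix} 1 & 0 & 0 \\ 1 & \frac{h_3}{h_2} & 0 \\ 1 & \frac{h_5}{h_4} & 1 \end{pmatrix} \to \begin{pmatrix} 1 & 0 & 0 \\ 1 & \frac{h_3 h_4}{h_2 h_5} & 0 \\ 1 & 1 & 1 \end{pmatrix}.$$

We can therefore restrict ourselves to the analysis of channel matrices of the form

$$\mathbf{H}_\lambda = \begin{pmatrix} 1 & 0 & 0 \\ 1 & \lambda & 0 \\ 1 & 1 & 1 \end{pmatrix} \tag{62}$$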
where $\lambda \in \mathbb{Q} \setminus \{0\}$. This example class was studied before in [3], [7]. In particular, using the DoF-
characterization in terms of information dimension (11), Wu et al. showed that [3, Thm. 11]

$$\mathrm{DoF}(\mathbf{H}_\lambda) = 1 + \sup_{X_1, X_2} \big[ d(X_1 + \lambda X_2) - d(X_1 + X_2) \big], \tag{63}$$

where the supremum is taken over all independent $X_1, X_2$ such that $\mathbb{E}[X_1^2], \mathbb{E}[X_2^2] < \infty$ and the
appearing information dimension terms exist. Based on (63) one can lower-bound $\mathrm{DoF}(\mathbf{H}_\lambda)$ through
concrete choices for the input distributions $X_1$ and $X_2$. If one is interested in analytical expressions,
these choices are, however, restricted to input distributions that allow analytical expressions for the
information dimension terms appearing in (63). Upper bounds on $\mathrm{DoF}(\mathbf{H}_\lambda)$ can be established by
employing general upper and lower bounds on information dimension. However, there is not much
one can get beyond what basic inequalities deliver.

By applying Theorem 3 to the channel matrix (62), we next develop an alternative characterization
to (63). The resulting expression for $\mathrm{DoF}(\mathbf{H}_\lambda)$ involves the minimization of the ratio of entropies of
linear combinations of discrete random variables and is analytically and numerically more tractable
than (63).

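Theorem 4: For

$$\mathbf{H}_\lambda = \begin{pmatrix} 1 & 0 & 0 \\ 1 & \lambda & 0 \\ 1 & 1 & 1 \end{pmatrix},$$

we have

$$\mathrm{DoF}(\mathbf{H}_\lambda) = 2 - \inf_{U, V} \frac{H(U + V)}{H(U + \lambda V)}, \tag{64}$$

where the infimum is taken over all independent discrete random variables $U, V$ such that^{12} $H(U + \lambda V) > 0$.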


Proof: As the off-diagonal entries of $\mathbf{H}_\lambda$ are all rational and therefore algebraic numbers, we
have equality in (57), which upon insertion of $\mathbf{H}_\lambda$ yields

$$\mathrm{DoF}(\mathbf{H}_\lambda) = \sup_{U, V, W} \frac{H(U + \lambda V) + H(U + V + W) - H(U + V)}{\max\{H(U), H(U + \lambda V), H(U + V + W)\}}, \tag{65}$$

where the supremum is taken over all independent discrete random variables $U, V, W$ such that the
denominator in (65) is nonzero. Now, again using [19, Ex. 2.14], we have $H(U) \leq H(U + \lambda V)$,
which when inserted into (65) yields

$$\mathrm{DoF}(\mathbf{H}_\lambda) = \sup_{U, V, W} \frac{H(U + \lambda V) + H(U + V + W) - H(U + V)}{\max\{H(U + \lambda V), H(U + V + W)\}} \tag{66}$$
$$\leq 1 + \sup_{U, V, W} \frac{H(U + \lambda V) - H(U + V)}{\max\{H(U + \lambda V), H(U + V + W)\}} \tag{67}$$
$$\leq 1 + \sup_{U, V} \frac{H(U + \lambda V) - H(U + V)}{H(U + \lambda V)} \tag{68}$$
$$= 2 - \inf_{U, V} \frac{H(U + V)}{H(U + \lambda V)}, \tag{69}$$

where we used the fact that the supremum in (67) is non-negative (as seen, e.g., by choosing $U$ to be
non-deterministic and $V$ deterministic) and hence invoking $\max\{H(U + \lambda V), H(U + V + W)\} \geq H(U + \lambda V)$
in the denominator of (67) yields the upper bound (68).

^{12} Again, this condition simply prevents the denominator in (64) from being zero. The case $H(U + \lambda V) = 0$ is equivalent
to $U$ and $V$ deterministic. This choice would, however, yield $\mathrm{dof}(X_1, \ldots, X_K; \mathbf{H}) \leq 1$ and is thus not of interest.

For the converse part, let $U, V$ be independent discrete random variables such that $H(U + \lambda V) > 0$.
We take $W$ to be discrete, independent of $U$ and $V$, and to satisfy

$$H(W) \geq H(U + \lambda V), \tag{70}$$

e.g., we may simply choose $W$ to be uniformly distributed on a sufficiently large finite set. Applying
Proposition 1 with $W_1 = U$, $W_2 = V$, $W_3 = W$, and $r := 2^{-H(U + \lambda V)}$, we obtain

$$\min\Big\{ \frac{H(U)}{H(U + \lambda V)}, 1 \Big\} + \min\Big\{ \frac{H(U + \lambda V)}{H(U + \lambda V)}, 1 \Big\} - \min\Big\{ \frac{H(U)}{H(U + \lambda V)}, 1 \Big\}$$
$$+ \min\Big\{ \frac{H(U + V + W)}{H(U + \lambda V)}, 1 \Big\} - \min\Big\{ \frac{H(U + V)}{H(U + \lambda V)}, 1 \Big\} \leq \mathrm{DoF}(\mathbf{H}_\lambda). \tag{71}$$

Since $H(U + V + W) \geq H(W) \geq H(U + \lambda V)$, where the first inequality is by [19, Ex. 2.14] and
the second by the assumption (70), we get from (71) that

$$2 - \min\Big\{ \frac{H(U + V)}{H(U + \lambda V)}, 1 \Big\} \leq \mathrm{DoF}(\mathbf{H}_\lambda). \tag{72}$$

We treat the cases $H(U + V) > H(U + \lambda V)$ and $H(U + V) \leq H(U + \lambda V)$ separately. If $H(U + V) > H(U + \lambda V)$, then

$$2 - \frac{H(U + V)}{H(U + \lambda V)} < 1 = 2 - \min\Big\{ \frac{H(U + V)}{H(U + \lambda V)}, 1 \Big\} \leq \mathrm{DoF}(\mathbf{H}_\lambda). \tag{73}$$

On the other hand, if $H(U + V) \leq H(U + \lambda V)$, (72) becomes

$$2 - \frac{H(U + V)}{H(U + \lambda V)} \leq \mathrm{DoF}(\mathbf{H}_\lambda). \tag{74}$$

Combining (73) and (74), we finally get

$$2 - \frac{H(U + V)}{H(U + \lambda V)} \leq \mathrm{DoF}(\mathbf{H}_\lambda), \tag{75}$$

for all independent $U, V$ such that $H(U + \lambda V) > 0$. Taking the supremum in (75) over all admissible
$U$ and $V$ completes the proof. ■
Through Theorem 4 we reduced the DoF-characterization of $\mathbf{H}_\lambda$ to an optimization of the ratio
of the entropies of two linear combinations of discrete random variables. This optimization problem
has a counterpart in additive combinatorics, namely the following sumset problem: find finite sets
$\mathcal{U}, \mathcal{V} \subseteq \mathbb{R}$ such that the relative size

$$\frac{|\mathcal{U} + \mathcal{V}|}{|\mathcal{U} + \lambda \mathcal{V}|} \tag{76}$$

of the sumsets $\mathcal{U} + \mathcal{V}$ and $\mathcal{U} + \lambda \mathcal{V}$ is minimal. The additive combinatorics literature provides a
considerable body of useful bounds on (76) as a function of $|\mathcal{U}|$ and $|\mathcal{V}|$ [17]. A complete answer to
this minimization problem does, however, not seem to be available. Generally, finding the minimal
value of sumset quantities as in (76) or corresponding entropic quantities, i.e., $H(U + V)/H(U + \lambda V)$
in this case, appears to be a very hard problem, which indicates why finding the exact number of
DoF of channel matrices with rational entries is so difficult.
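To make the sumset quantity (76) concrete, the following small Python sketch computes it for finite sets of rationals. The function names `sumset` and `sumset_ratio` and the example sets are illustrative assumptions; exact rational arithmetic via `Fraction` is used so that coinciding sums are detected reliably.

```python
from fractions import Fraction


def sumset(A, B, lam=1):
    """Return the set {a + lam * b : a in A, b in B} with exact arithmetic."""
    lam = Fraction(lam)
    return {Fraction(a) + lam * Fraction(b) for a in A for b in B}


def sumset_ratio(U, V, lam):
    """Relative size |U + V| / |U + lam * V| as in (76)."""
    return Fraction(len(sumset(U, V)), len(sumset(U, V, lam)))


if __name__ == "__main__":
    # An arithmetic progression: sumset and difference set have equal size.
    print(sumset_ratio({0, 1, 2, 3}, {0, 1, 2, 3}, -1))
    # A set with more differences than sums.
    print(sumset_ratio({0, 1, 3}, {0, 1, 3}, -1))
    # Dilation by lam = 2 enlarges the second sumset.
    print(sumset_ratio({0, 1, 2, 3}, {0, 1, 2, 3}, 2))
```

Exhaustive search over small sets is feasible with such a helper, but the search space grows exponentially with the set size, which is consistent with the difficulty described above.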

The formal relationship between DoF characterization and sumset theory, by virtue of Theorem 3, 
goes beyond H with rational entries and applies to general H. The resulting linear combinations one 
has to deal with, however, quickly lead to very hard optimization problems. 

We finally show how our alternative DoF-characterization can be put to use to improve the best
known bounds on $\mathrm{DoF}(\mathbf{H}_\lambda)$ for $\lambda = -1$. Similar improvements are possible for other values of $\lambda$.
For brevity we restrict ourselves, however, to the case $\lambda = -1$.

Proposition 2: We have

$$1.13258 \leq \mathrm{DoF}(\mathbf{H}_{-1}) \leq \frac{4}{3}.$$


Proof: For the lower bound, we choose $U$ and $V$ to be independent and distributed according to

$$P[U = 0] = P[V = 0] = (0.08)^3$$
$$P[U = 1] = P[V = 1] = (0.08)^2$$
$$P[U = 2] = P[V = 2] = 0.08$$
$$P[U = 3] = P[V = 3] = 1 - 0.08 - (0.08)^2 - (0.08)^3.$$

This choice is motivated by numerical investigations, not reported here. It then follows from (64) that

$$\mathrm{DoF}(\mathbf{H}_{-1}) \geq 2 - \frac{H(U + V)}{H(U - V)} = 1.13258. \tag{77}$$

A more careful construction of $U$ and $V$ should allow improvements of this lower bound.
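The numerical value in (77) can be reproduced with a few lines of Python. The sketch below evaluates $2 - H(U+V)/H(U-V)$ for the distribution stated above; the helper names `entropy` and `combine` are illustrative, and entropies are computed in bits (the ratio does not depend on the base).

```python
from math import log2


def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {value: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


def combine(pmf_u, pmf_v, sign):
    """pmf of U + sign * V for independent integer-valued U and V."""
    out = {}
    for u, pu in pmf_u.items():
        for v, pv in pmf_v.items():
            key = u + sign * v
            out[key] = out.get(key, 0.0) + pu * pv
    return out


p = 0.08
pmf = {0: p**3, 1: p**2, 2: p, 3: 1 - p - p**2 - p**3}

h_sum = entropy(combine(pmf, pmf, +1))   # H(U + V)
h_diff = entropy(combine(pmf, pmf, -1))  # H(U - V)
lower_bound = 2 - h_sum / h_diff

print(f"H(U+V) = {h_sum:.5f}, H(U-V) = {h_diff:.5f}")
print(f"lower bound on DoF(H_-1) = {lower_bound:.5f}")
```

Replacing `pmf` by other candidate distributions gives a simple way to search for improved lower bounds.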

For the upper bound, let $U$ and $V$ be independent discrete random variables such that $H(U - V) > 0$
as required in the infimum in (64). Recall the entropy inequalities (17) and (18) stating that

$$H(U - V) \leq 3 H(U + V) - H(U) - H(V) \tag{78}$$
$$H(U - V) \leq \tfrac{1}{2} H(U + V) + \tfrac{2}{3} \big( H(U) + H(V) \big). \tag{79}$$

Multiplying (78) by 2/3 and adding the result to (79) yields

$$\tfrac{5}{3} H(U - V) \leq \tfrac{5}{2} H(U + V),$$

and hence

$$\frac{H(U + V)}{H(U - V)} \geq \frac{2}{3}. \tag{80}$$

Using (80) in (64), we then obtain

$$\mathrm{DoF}(\mathbf{H}_{-1}) \leq 2 - \frac{2}{3} = \frac{4}{3},$$

which completes the proof. ■

The bounds in Proposition 2 improve on the best known bounds obtained in [3, Thm. 11]^{13} as

$$1.0681 \leq \mathrm{DoF}(\mathbf{H}_{-1}) \leq \frac{7}{5}.$$